Configurable Microprocessor Implementation of Low Bit Rate Audio Decoding

Author

  • Gary S. Brown
Abstract

Using a configurable microprocessor to implement low-bit-rate audio applications by tailoring the instruction set reduces algorithm complexity and implementation cost. As an example, this paper describes a Dolby Digital (AC-3) decoder implementation that uses a commercially available configurable microprocessor to achieve 32-bit floating-point precision while minimizing the required processor clock rate and die size. This paper focuses on how the audio quality and features of the reference decoder algorithm dictate the customization of the microprocessor. It shows examples of audio-specific extensions to the processor's instruction set that create a family of AC-3 decoder implementations meeting multiple performance and cost points. How this approach benefits other audio applications is also discussed.

INTRODUCTION

Even in an ideal world, audio algorithm designers can't ignore implementation details. The choice of implementation method, driven by a variety of marketing factors, can adversely affect audio performance, features, and quality. Two general cases serve as examples: software implementation and custom chip implementation.

First, a software-only implementation using a programmable off-the-shelf microprocessor offers flexibility and helps meet development time requirements. However, a programmable approach to audio implementation relies on hardware with general resources targeted at a wide range of algorithms. A dual multiplier-accumulator (MAC) and a barrel shifter are examples of hardware available in programmable DSP processors used in software-only implementations. Choosing a software-only approach carries a cost in performance and quality, because it forgoes an application-specific hardware implementation tuned to the particular requirements of the algorithms to be implemented.
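As a rough illustration of the general-purpose DSP resources mentioned above, the following C sketch shows a fixed-point dot product in which each loop step is one multiply-accumulate and the final rescaling is a shift. The function name `dot_q15` and the choice of the Q15 number format are assumptions made for this example; they do not come from the paper.

```c
#include <stddef.h>
#include <stdint.h>

/* Illustrative only: dot_q15 and the Q15 format are assumptions,
 * not part of the paper's decoder. */
int32_t dot_q15(const int16_t *x, const int16_t *h, size_t n)
{
    int64_t acc = 0;                       /* wide accumulator, as in a MAC unit */
    for (size_t i = 0; i < n; i++) {
        acc += (int32_t)x[i] * h[i];       /* one multiply-accumulate per tap */
    }
    /* Rescale the Q30 sum back to Q15. Dividing by 2^15 rounds toward zero
     * and avoids relying on the implementation-defined right shift of a
     * negative value; a barrel shifter would do this scaling in one step. */
    return (int32_t)(acc / 32768);
}
```

On a programmable DSP, a loop of this shape maps onto the MAC hardware directly, but the precision and rounding are fixed by what the general-purpose datapath offers rather than by what the audio algorithm needs.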
BROWN AUDIO DECODER FOR CONFIGURABLE PROCESSOR

In the second case, a custom chip design typically meets application execution speed and cost requirements. A custom hardware approach lends itself to improved performance and audio features supported in application-specific hardware, at the cost of lengthier development time. This is the route taken by many vendors of high-volume audio products.

There are several common implementation techniques for low-bit-rate audio coders. Examples are stand-alone general-purpose audio DSPs and custom-designed application-specific ICs (ASICs), as well as synthesizable microprocessor cores integrated into a system on chip (SOC). Recently, a trend in SOC design toward more features has forced chip cost to take priority over algorithm function and quality.

This paper examines an audio implementation technique that helps meet the performance, feature, and quality needs of the audio algorithm by tailoring the instruction set of a microprocessor. As an example, the Dolby Digital (AC-3) reference algorithm [1] was implemented by Tensilica on its commercially available configurable microprocessor [2]. This paper describes some of the new instructions designed as additions to the processor's instruction set. These instruction extensions demonstrate the improved cycle-count performance that can be obtained while meeting the required functionality and quality of the reference algorithm. Tailoring a configurable processor for low-bit-rate audio decoding demonstrates an implementation technique that is tuned to marketing needs such as performance and cost while meeting algorithm function and quality requirements. Finally, this paper describes how this implementation method affects cost, performance, and audio quality in general low-bit-rate audio coding applications.
TAILORED MICROPROCESSORS FOR AUDIO

Previous work in customizing audio processors includes a proprietary audio DSP processor core used in an integrated audio/video decoder IC [3], as well as a licensed DSP core integrated alongside custom hardware extensions for audio coding performance and improved precision [4]. Such custom designs offer improved precision and processor efficiency (i.e., a lower clock rate) over a software-only approach while still allowing programmability. However, a configurable processor core has the additional facility to extend the instruction set with user-designed instructions.

FEATURES OF A CONFIGURABLE PROCESSOR

Figure 1 shows one of the possible architectures using a configurable microprocessor. A custom chip based on a configurable processor typically includes a set of fixed features; in the diagram they are labeled Base Instruction Set Architecture (ISA) Features. Various optional functions, such as a floating-point unit (FPU) or a 16-bit multiplier (MAC16), can be added to the standard processor. These options have been pre-verified by the vendor of the configurable processor. The processor also typically provides several standard bus and memory interfaces, such as the processor interface (PIF) to system memory or a local memory interface (LMI) to single-cycle-access memory units.

Most importantly, the system designer can add custom logic. The designer-defined register files and execution units shown in Figure 1 are examples of such custom logic. In some configurable processors, this custom logic becomes tightly integrated with the standard processor. For example, the processor reserves opcodes for user-defined custom instructions. In addition, the software tools provided by the manufacturer of the configurable processor (C compiler, instruction-set simulator, debugger, linker, and so on) are automatically adapted to incorporate the new opcodes and custom instruction logic.
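To make the idea of a custom instruction exposed to C more concrete, here is a minimal sketch under stated assumptions. AC-3 represents each transform coefficient as an exponent and a mantissa, so scaling a mantissa by 2^-exponent is a plausible candidate for a fused operation. The operation, the names `denorm` and `denorm_block`, the macro `HAVE_CUSTOM_DENORM`, and the placeholder intrinsic `CUSTOM_DENORM` are all hypothetical; this excerpt does not show the instructions actually designed for the decoder.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical example: neither the names nor the operation are taken
 * from the paper. */
#ifdef HAVE_CUSTOM_DENORM
/* On a configured core, the adapted tool chain would expose the new opcode
 * to C; CUSTOM_DENORM is only a placeholder name for such an intrinsic. */
#define denorm(mant, exp) CUSTOM_DENORM((mant), (exp))
#else
/* Portable reference: scale a signed fixed-point mantissa by 2^-exp,
 * rounding toward negative infinity. */
static inline int32_t denorm(int32_t mant, unsigned exp)
{
    if (exp > 31) {
        exp = 31;                          /* keep the shift count in range */
    }
    /* Shift the complement of negative values so that only non-negative
     * numbers are shifted, which is fully defined in C. */
    return (mant < 0) ? ~(~mant >> exp) : (mant >> exp);
}
#endif

/* Apply the operation to a block of coefficients. The C source is the same
 * whether denorm() is the portable function or a custom instruction. */
void denorm_block(const int32_t *mant, const uint8_t *exp,
                  int32_t *out, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        out[i] = denorm(mant[i], exp[i]);
    }
}
```

The point of this arrangement is that the decoder stays in plain C: the same call site compiles to a few base instructions on a standard core and to a single opcode on a core configured with the extension.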
SYSTEM REQUIREMENTS

In order to tailor a configurable processor with new instructions to meet system needs, a reasonable set of system requirements is necessary. For the AC-3 multichannel decoder algorithm, the following requirements are sufficient:

1. output precision sufficient to achieve over 110 dB dynamic range at the output
2. real-time decoding at less than 40 million instructions per second (MIPS) for worst-case bitstreams
3. all features and functions in the source code to be implemented
4. C-code-only implementation (i.e., no low-level assembly)

AES 113 CONVENTION, LOS ANGELES, CA, USA, 2002 OCTOBER 5–8
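The first two requirements can be turned into working numbers with a little arithmetic. The sketch below is a back-of-the-envelope aid, not part of the paper: it uses the rule of thumb of about 6.02 dB of dynamic range per bit, and it assumes a 48 kHz sample rate and the AC-3 frame length of 1536 samples per channel, neither of which is stated in this excerpt. The function names are hypothetical.

```c
#include <stdint.h>

/* Rule of thumb: each bit of word length adds 20*log10(2) dB, about 6.0206 dB. */
#define DB_PER_BIT 6.0206

/* Smallest word length (in bits) whose ideal dynamic range reaches `db`. */
int bits_for_dynamic_range(double db)
{
    int bits = (int)(db / DB_PER_BIT);
    if (bits * DB_PER_BIT < db) {
        bits++;                            /* round up to a whole bit */
    }
    return bits;
}

/* Instructions available to decode one frame in real time. */
int64_t instructions_per_frame(int64_t instr_per_sec,
                               int64_t sample_rate_hz,
                               int64_t samples_per_frame)
{
    return (instr_per_sec * samples_per_frame) / sample_rate_hz;
}
```

Under these assumptions, `bits_for_dynamic_range(110.0)` gives 19 bits, and `instructions_per_frame(40000000, 48000, 1536)` gives 1,280,000 instructions for each 32 ms frame, covering all channels. Any custom instruction therefore has to be judged by how many cycles it saves against that per-frame budget.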




Publication date: 2002